A Bayesian Network approach to County-Level Corn Yield Prediction using historical data and expert knowledge
Authors
Abstract
Crop yield forecasting is the practice of predicting crop yields prior to harvest. The availability of accurate yield prediction frameworks has enormous implications from multiple standpoints, including the impact on crop commodity futures markets, the formulation of agricultural policy, and crop insurance rating. The focus of this work is to construct a corn yield predictor at the county scale. Corn yield depends on a complex, interconnected set of variables that include economic, agricultural, management and meteorological factors. Conventional forecasting relies either on knowledge-based computer programs (which simulate plant-weather-soil-management interactions) coupled with targeted surveys, or on statistical models. The former is limited by the need for painstaking calibration, while the latter is limited to univariate analysis or similar simplifying assumptions that fail to capture the complex interdependencies affecting yield. In this paper, we propose a data-driven 'gray box' approach, that is, one that seamlessly uses expert knowledge in constructing a statistical network model for corn yield forecasting. Our multivariate gray box model uses Bayesian network analysis to build a Directed Acyclic Graph (DAG) between predictors and yield. Starting from a complete graph connecting various carefully chosen variables and yield, expert knowledge is used to prune or strengthen the edges connecting variables. Subsequently, the structure (connectivity and edge weights) of the DAG that maximizes the likelihood of observing the training data is identified via optimization. We curated an extensive set of historical data (1948-2012) for each of the 99 counties in Iowa to train the model. We discuss preliminary results, focusing specifically on (a) the structure of the learned network and how it corroborates known trends, and (b) how partial information still produces reasonable predictions (predictions with gappy data); we also show that incorporating the missing information improves predictions.
KDD ’16 Workshop: Data Science for Food, Energy and Water, August 14, 2016, San Francisco, CA, USA. © 2016 ACM. ISBN 978-1-4503-2138-9. DOI: 10.1145/1235
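The pipeline described in the abstract (start from a complete graph, let expert knowledge prune edges, fit the network to data, then predict with missing inputs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three variables, the low/high discretization, the toy records, the Laplace smoothing, and the brute-force enumeration used for inference. In particular, the structure here is fixed by a hypothetical expert rule rather than found by the likelihood optimization the abstract mentions.

```python
from collections import Counter
from itertools import product
from math import log

# Hypothetical toy setup: three binary variables, eight made-up records.
VARS = ("rain", "heat", "yield")
STATES = ("low", "high")
RECORDS = [
    {"rain": "high", "heat": "low", "yield": "high"},
    {"rain": "high", "heat": "low", "yield": "high"},
    {"rain": "high", "heat": "high", "yield": "high"},
    {"rain": "high", "heat": "high", "yield": "low"},
    {"rain": "low", "heat": "low", "yield": "high"},
    {"rain": "low", "heat": "low", "yield": "low"},
    {"rain": "low", "heat": "high", "yield": "low"},
    {"rain": "low", "heat": "high", "yield": "low"},
]

# Complete graph over all ordered pairs; an assumed expert rule keeps
# only the weather -> yield edges, which leaves a valid DAG.
CANDIDATE_EDGES = [(a, b) for a in VARS for b in VARS if a != b]
EXPERT_ALLOWED = {("rain", "yield"), ("heat", "yield")}
PRUNED_EDGES = [e for e in CANDIDATE_EDGES if e in EXPERT_ALLOWED]


def parents_of(node, edges):
    """Return the parents of a node in a fixed (sorted) order."""
    return tuple(sorted(a for a, b in edges if b == node))


def fit_cpts(records, edges, alpha=1.0):
    """Estimate conditional probability tables by counting, with Laplace smoothing."""
    cpts = {}
    for node in VARS:
        pa = parents_of(node, edges)
        counts = Counter((tuple(r[p] for p in pa), r[node]) for r in records)
        table = {}
        for pa_vals in product(STATES, repeat=len(pa)):
            total = sum(counts[(pa_vals, s)] for s in STATES) + alpha * len(STATES)
            for s in STATES:
                table[(pa_vals, s)] = (counts[(pa_vals, s)] + alpha) / total
        cpts[node] = (pa, table)
    return cpts


def joint(assignment, cpts):
    """Joint probability of a full assignment under the DAG factorization."""
    p = 1.0
    for node, (pa, table) in cpts.items():
        p *= table[(tuple(assignment[q] for q in pa), assignment[node])]
    return p


def log_likelihood(records, cpts):
    """Log-likelihood of the records under the fitted network."""
    return sum(log(joint(r, cpts)) for r in records)


def predict_yield(evidence, cpts):
    """P(yield | evidence), summing out every predictor missing from evidence."""
    hidden = [v for v in VARS if v != "yield" and v not in evidence]
    scores = {}
    for y in STATES:
        scores[y] = sum(
            joint({**evidence, **dict(zip(hidden, vals)), "yield": y}, cpts)
            for vals in product(STATES, repeat=len(hidden))
        )
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}


cpts = fit_cpts(RECORDS, PRUNED_EDGES)
empty_cpts = fit_cpts(RECORDS, [])  # baseline with no edges at all
full = predict_yield({"rain": "high", "heat": "low"}, cpts)
partial = predict_yield({"rain": "high"}, cpts)  # heat is missing
print("log-likelihood (pruned graph):", log_likelihood(RECORDS, cpts))
print("log-likelihood (no edges):   ", log_likelihood(RECORDS, empty_cpts))
print("full evidence:   ", full)
print("partial evidence:", partial)
```

On this toy data the prediction from partial evidence is still usable but less sharp than the one from full evidence, which mirrors point (b) of the abstract in miniature. Exhaustive enumeration is only practical for a handful of variables; a network of realistic size would need a proper inference algorithm.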
Similar resources
Empirical Studies of Corn Yield Distribution Modeling
This research paper uses part of the historical corn yields of each county across five states as a training set for calibrating alternative yield models, including single parametric distributions, nonparametric distributions, semiparametric distributions, mixing models, and other nontraditional models. The calibrated models are applied to predict the yield of the most recent eight years in historical ...
A Bayesian Networks Approach to Reliability Analysis of a Launch Vehicle Liquid Propellant Engine
This paper presents an extension of Bayesian networks (BN) applied to the reliability analysis of an open gas generator cycle liquid propellant engine (OGLE) of launch vehicles. Several methods exist for system reliability analysis, such as RBD, FTA, FMEA, and Markov chains, but for complex systems such as launch vehicles (LV) they are not all efficiently applicable due to failure dependencies between compo...
The effects of climatic hazards on agricultural activities (bean cultivation) of villagers in Azna County
Introduction: Agriculture is the main axis of the economy and development of the rural areas of Azna County and plays an important role in the performance of the village and the rural environment. With regard to the natural environment and climatic factors, performance evaluation of agricultural products is one of the important pillars of the sustainability of food supply and the rural economy. Climatic hazards suc...
A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing)
Training and adaptation of employees consume time and money. Employee turnover can be predicted from organizational and personal historical data in order to reduce probable losses to organizations. Prediction methods are closely tied to human resource management, where they extract patterns from historical data. This article implements knowledge discovery steps on real data of a manufacturing pla...
Determining risk factors and presenting a prognostic model for pulmonary embolism in hospitalized patients using Bayesian networks
Background and Objectives: Pulmonary embolism is a potentially fatal and prevalent event that has led to a gradual increase in the number of hospitalizations in recent years. For this reason, it is one of the most challenging diseases for physicians. The main purpose of this paper was to report a research project to compare different data mining algorithms to select the most accurate model for ...
Journal title:
- CoRR
Volume abs/1608.05127, Issue
Pages -
Publication date 2016